Offload leaf search work to AWS Lambda functions by fulmicoton-dd · Pull Request #6157 · quickwit-oss/quickwit

fulmicoton-dd · 2026-02-12T12:49:18Z

Summary

The goal is to handle traffic spikes gracefully without provisioning additional searcher nodes: when the local search queue is saturated, overflow splits are transparently routed to Lambda for processing.

How offloading works

The offloading decision happens on the leaf side, inside the SearchPermitProvider. The permit provider already manages a bounded queue of pending split search tasks (gated by memory budget and download slots). When a leaf search request arrives, the provider checks the current queue depth against a configurable offload_threshold. If granting permits for all requested splits would exceed this threshold, only enough splits to fill up to the threshold are processed locally — the rest are marked for offloading.
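The threshold rule above can be sketched as follows. This is a minimal illustration only: `num_local_splits` and `partition_splits` are hypothetical names, not the actual `get_permits_with_offload` API, and the sketch ignores the memory budget and download slots that the real permit provider also accounts for.

```rust
/// Hypothetical helper: how many of the requested splits stay local.
/// `pending` is the number of split search tasks already in the queue.
fn num_local_splits(pending: usize, requested: usize, offload_threshold: usize) -> usize {
    // Room left before the queue reaches the threshold (0 if already at or past it).
    let room = offload_threshold.saturating_sub(pending);
    requested.min(room)
}

/// Partitions the requested splits into (local, offloaded), preserving order.
fn partition_splits<T>(
    mut splits: Vec<T>,
    pending: usize,
    offload_threshold: usize,
) -> (Vec<T>, Vec<T>) {
    let num_local = num_local_splits(pending, splits.len(), offload_threshold);
    // Everything past the first `num_local` splits is marked for offloading.
    let offloaded = splits.split_off(num_local);
    (splits, offloaded)
}
```

With a threshold of 0 the room is always 0, so every split is offloaded, which matches the "0 = always offload" note in the configuration.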

The offloaded splits are batched (up to max_splits_per_invocation splits per batch, balanced by document count) and sent to Lambda in parallel. Each Lambda invocation runs the same leaf search code path and returns per-split results individually. This is important: the per-split responses are fed back into the IncrementalCollector and populate the partial result cache, so subsequent queries hitting the same splits benefit from cached results regardless of whether the split was searched locally or on Lambda.
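One way to balance batches by document count is a greedy assignment driven by a min-heap, sketched below. The commit log mentions a `greedy_batch_split` function that uses a binary heap, but its exact algorithm is not shown in this thread; the `Split` struct, the function name, and the "largest split first" ordering here are assumptions made for illustration.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical, simplified view of a split to offload.
#[derive(Debug, Clone, PartialEq)]
struct Split {
    split_id: String,
    num_docs: u64,
}

/// Greedily assigns splits (largest first) to the batch that currently holds
/// the fewest documents, never exceeding `max_splits_per_batch` per batch.
fn greedy_batches(mut splits: Vec<Split>, max_splits_per_batch: usize) -> Vec<Vec<Split>> {
    if splits.is_empty() || max_splits_per_batch == 0 {
        return Vec::new();
    }
    // Smallest number of batches that can hold every split.
    let num_batches = (splits.len() + max_splits_per_batch - 1) / max_splits_per_batch;
    let mut batches: Vec<Vec<Split>> = vec![Vec::new(); num_batches];
    // Min-heap over (total docs in batch, batch index); full batches are not re-inserted.
    let mut heap: BinaryHeap<Reverse<(u64, usize)>> =
        (0..num_batches).map(|idx| Reverse((0u64, idx))).collect();
    splits.sort_by(|a, b| b.num_docs.cmp(&a.num_docs));
    for split in splits {
        let Reverse((docs, idx)) = heap.pop().expect("total capacity covers all splits");
        let new_docs = docs + split.num_docs;
        batches[idx].push(split);
        if batches[idx].len() < max_splits_per_batch {
            heap.push(Reverse((new_docs, idx)));
        }
    }
    batches
}
```

Each resulting batch would then map to one Lambda invocation, with the invocations issued in parallel.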

Auto-deployment

Depending on the configuration, the Lambda function code can be deployed automatically at startup. The quickwit-lambda-client crate embeds a compressed Lambda binary at compile time. When auto_deploy is configured, Quickwit will:

1. Check if a published Lambda version matching the current binary already exists (identified by a description tag quickwit:{version}-{hash})
2. Create or update the function and publish a new version if needed
3. Garbage-collect old versions (keeping the current one + 5 most recent)

This ensures the Lambda function always matches the running Quickwit version without any external deployment tooling. Manual deployment is also supported for users who prefer to manage Lambda functions through Terraform or other IaC tools.
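The version matching and garbage-collection steps can be sketched in plain Rust, without any AWS SDK calls. All names below (`PublishedVersion`, `description_tag`, `find_matching`, `versions_to_delete`) are hypothetical, and the sketch assumes that a higher Lambda version number means a more recent version. The review thread also notes that the description later gained some configuration parameters, which this sketch does not model.

```rust
/// Hypothetical, simplified view of a published Lambda version.
#[derive(Debug, Clone, PartialEq)]
struct PublishedVersion {
    number: u64,
    description: String,
}

/// Builds the description tag in the `quickwit:{version}-{hash}` format.
fn description_tag(version: &str, hash: &str) -> String {
    format!("quickwit:{version}-{hash}")
}

/// Step 1: look for an already published version carrying the expected tag.
fn find_matching<'a>(
    versions: &'a [PublishedVersion],
    tag: &str,
) -> Option<&'a PublishedVersion> {
    versions.iter().find(|v| v.description == tag)
}

/// Step 3: version numbers to delete, keeping `current` plus the
/// `keep_recent` most recent other versions.
fn versions_to_delete(
    versions: &[PublishedVersion],
    current: u64,
    keep_recent: usize,
) -> Vec<u64> {
    let mut others: Vec<u64> = versions
        .iter()
        .map(|v| v.number)
        .filter(|number| *number != current)
        .collect();
    // Most recent first (assumes version numbers grow over time).
    others.sort_unstable_by(|a, b| b.cmp(a));
    others.into_iter().skip(keep_recent).collect()
}
```

Step 2 (create or update the function, then publish) would only run when `find_matching` returns `None`.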

Configuration

Lambda offloading is opt-in. Add a lambda section under searcher in the node configuration:

searcher:
  lambda:
    offload_threshold: 100     # queue depth before offloading kicks in (0 = always offload)
    max_splits_per_invocation: 10
    auto_deploy:
      execution_role_arn: arn:aws:iam::123456789012:role/quickwit-lambda-role
      memory_size: 5 GiB
      invocation_timeout_secs: 15

New crates

- quickwit-lambda-client: Handles Lambda invocation (with metrics) and auto-deployment logic. Embeds the Lambda binary at build time.
- quickwit-lambda-server: The Lambda function handler itself: receives a LeafSearchRequest, runs multi_index_leaf_search, and returns per-split LeafSearchResponses.

Key changes in existing crates

- quickwit-search: New LambdaLeafSearchInvoker trait; SearchPermitProvider gains get_permits_with_offload to split work between local and offloaded; leaf.rs orchestrates local and Lambda tasks in parallel.
- quickwit-config: New LambdaConfig and LambdaDeployConfig structs under SearcherConfig.
- quickwit-serve: Initializes the Lambda invoker at startup when configured.
- quickwit-proto: New LeafSearchResponses wrapper message for batched per-split responses.

Copilot

Pull request overview

Adds an opt-in AWS Lambda “overflow” execution path for leaf split search to handle traffic spikes without adding more searcher nodes, including auto-deploy of the Lambda binary and per-split result integration into the existing partial result cache / incremental merge flow.

Changes:

- Introduces quickwit-lambda-client (invocation + auto-deploy + metrics) and quickwit-lambda-server (Lambda handler running Quickwit leaf search).
- Extends searcher configuration/context to support Lambda offloading, and updates leaf-search scheduling to split work between local permits and Lambda batches.
- Adds protobuf support for batched per-split responses plus docs and CI workflow for publishing the Lambda binary.

Reviewed changes

Copilot reviewed 43 out of 45 changed files in this pull request and generated 13 comments.


File	Description
quickwit/quickwit-storage/src/cache/memory_sized_cache.rs	Adds a regression test for `CacheConfig::no_cache()` behavior.
quickwit/quickwit-serve/src/lib.rs	Initializes Lambda invoker on startup when searcher+lambda are configured.
quickwit/quickwit-serve/Cargo.toml	Adds dependency on `quickwit-lambda-client`.
quickwit/quickwit-search/src/tests.rs	Updates tests for `SearcherContext::new(..., lambda_invoker)` signature changes.
quickwit/quickwit-search/src/service.rs	Extends `SearcherContext` to carry an optional Lambda invoker.
quickwit/quickwit-search/src/search_permit_provider.rs	Adds offload-aware permit acquisition logic (threshold-based truncation).
quickwit/quickwit-search/src/root.rs	Minor tracing import/use adjustment.
quickwit/quickwit-search/src/list_terms.rs	Adjusts permit sizing collection (prep for offload-aware behavior).
quickwit/quickwit-search/src/lib.rs	Exposes new `invoker` module and re-exports `LambdaLeafSearchInvoker`.
quickwit/quickwit-search/src/leaf_cache.rs	Minor whitespace cleanup.
quickwit/quickwit-search/src/leaf.rs	Implements local-vs-Lambda scheduling, batching, parallel execution, and merge integration.
quickwit/quickwit-search/src/invoker.rs	Introduces `LambdaLeafSearchInvoker` trait abstraction.
quickwit/quickwit-proto/src/error.rs	Updates error header doc comment wording.
quickwit/quickwit-proto/src/codegen/quickwit/quickwit.search.rs	Adds `LeafSearchResponses` wrapper message to generated code.
quickwit/quickwit-proto/protos/quickwit/search.proto	Adds `LeafSearchResponses` proto definition.
quickwit/quickwit-lambda/README.md	Removes old deprecation stub text.
quickwit/quickwit-lambda-server/src/lib.rs	Defines Lambda server crate exports/modules.
quickwit/quickwit-lambda-server/src/handler.rs	Implements Lambda handler: decode request, run per-split searches, encode responses.
quickwit/quickwit-lambda-server/src/error.rs	Adds Lambda error types and conversions.
quickwit/quickwit-lambda-server/src/context.rs	Builds Lambda-optimized `SearcherConfig` from env and sets caches to `no_cache`.
quickwit/quickwit-lambda-server/src/config.rs	Adds a (currently empty) config stub file.
quickwit/quickwit-lambda-server/src/bin/leaf_search.rs	Provides Lambda binary entrypoint using `lambda_runtime`.
quickwit/quickwit-lambda-server/Cargo.toml	Adds Lambda server crate definition and dependencies/features.
quickwit/quickwit-lambda-client/src/metrics.rs	Adds Prometheus metrics for Lambda invocation and payload sizes.
quickwit/quickwit-lambda-client/src/lib.rs	Exposes deploy/invoker APIs and payload types.
quickwit/quickwit-lambda-client/src/invoker.rs	Implements AWS Lambda invocation + response decoding into per-split responses.
quickwit/quickwit-lambda-client/src/deploy.rs	Implements auto-deploy logic (version discovery/publish + GC).
quickwit/quickwit-lambda-client/build.rs	Downloads and embeds Lambda zip; computes content hash for versioning.
quickwit/quickwit-lambda-client/README.md	Documents Lambda release process and content-based versioning.
quickwit/quickwit-lambda-client/Cargo.toml	Adds Lambda client crate definition and dependencies/build deps.
quickwit/quickwit-config/src/node_config/serialize.rs	Extends serialization tests to cover lambda config.
quickwit/quickwit-config/src/node_config/mod.rs	Adds `LambdaConfig`, `LambdaDeployConfig`, `SearcherConfig.lambda`, and `CacheConfig::no_cache()`.
quickwit/quickwit-config/src/lib.rs	Re-exports lambda config types.
quickwit/quickwit-config/resources/tests/node_config/quickwit.yaml	Adds lambda section to test YAML config.
quickwit/quickwit-config/resources/tests/node_config/quickwit.toml	Adds lambda section to test TOML config.
quickwit/quickwit-config/resources/tests/node_config/quickwit.json	Adds lambda section to test JSON config.
quickwit/quickwit-config/Cargo.toml	Adds `quickwit-common` testsuite feature dep for config tests.
quickwit/quickwit-aws/src/lib.rs	Bumps AWS SDK behavior version used in defaults.
quickwit/Cargo.toml	Adds new crates to workspace members + workspace deps (lambda_runtime, ureq, zip, aws-sdk-lambda, aws-smithy-mocks).
quickwit/Cargo.lock	Locks new dependencies (aws-sdk-lambda, lambda_runtime, ureq, zip, etc.).
docs/configuration/lambda-config.md	Adds end-user documentation for lambda offloading + IAM + deployment.
LICENSE-3rdparty.csv	Updates third-party license list for new deps.
.github/workflows/publish_lambda.yaml	Adds workflow to build and draft-release the Lambda binary zip.
.github/actions/cross-build-binary/action.yml	Pins upload-to-github-release action by commit SHA.
.github/actions/cargo-build-macos-binary/action.yml	Pins upload-to-github-release action by commit SHA.

Comments suppressed due to low confidence (1)

quickwit/quickwit-lambda-server/src/config.rs:17

src/config.rs appears to be an unused stub (not referenced via mod config; and contains only imports). If it’s not needed, it should be removed; if it is intended to host config parsing, wire it up and add the missing implementation.

use anyhow::Context as _;
use bytesize::ByteSize;




trinity-1686a

review still in progress


trinity-1686a

i think we should somehow mark this feature as experimental for the time being (maybe warn on startup when it's enabled?). i suspect there are changes we'll want to make if/when we add support for other FaaS providers, in particular to what the config looks like.


trinity-1686a · 2026-02-16T10:57:54Z

quickwit/quickwit-search/src/leaf.rs

-    if let Some(cached_answer) = ctx
-        .searcher_context
-        .leaf_search_cache
-        .get(split.clone(), search_request.clone())
-    {
-        leaf_search_state_guard.set_state(SplitSearchState::CacheHit);
-        return Ok(Some(cached_answer));
-    }


i think this could stay: it could happen that we get a cache miss at the top level, wait for a bit while acquiring a permit, and now the value is there

ok I readded it with a comment explaining the redundancy
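The redundancy discussed here is a double-checked cache lookup: check once before queueing, then again after the permit is granted, because another query may have filled the cache during the wait. A minimal synchronous sketch follows; `LeafCache` and `search_split` are hypothetical stand-ins, a `u64` stands in for the leaf search response, and the real code is asynchronous and keyed by split and search request.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-in for the leaf search cache.
#[derive(Default)]
struct LeafCache {
    entries: Mutex<HashMap<String, u64>>,
}

impl LeafCache {
    fn get(&self, split_id: &str) -> Option<u64> {
        self.entries.lock().unwrap().get(split_id).copied()
    }

    fn put(&self, split_id: &str, value: u64) {
        self.entries.lock().unwrap().insert(split_id.to_string(), value);
    }
}

/// Returns (result, was_cache_hit).
fn search_split(
    cache: &LeafCache,
    split_id: &str,
    acquire_permit: impl FnOnce(),
    run_search: impl FnOnce() -> u64,
) -> (u64, bool) {
    // First lookup, before any waiting.
    if let Some(hit) = cache.get(split_id) {
        return (hit, true);
    }
    // May take a while; other queries can populate the cache in the meantime.
    acquire_permit();
    // Second, seemingly redundant lookup after the wait.
    if let Some(hit) = cache.get(split_id) {
        return (hit, true);
    }
    let result = run_search();
    cache.put(split_id, result);
    (result, false)
}
```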


fulmicoton · 2026-02-16T17:28:23Z

Refactoring needed:
A Lambda invocation should be able to see a single split fail without throwing away the rest of its work. Results shouldn't need to be ordered. The results should be a list of results named by split.

Need to investigate if the reuse of the leaf result protobuf is really mandatory here.

Are we sure merging can happen on the tokio loop?

trinity-1686a · 2026-02-16T17:32:25Z

we're not.

Lambda now reports (but does not retry) which splits failed and which succeeded. Individual per-split Lambda results are stored in the leaf cache (we keep track of the rewritten request to do so). The Lambda handler returns results named by split ID, in any order.
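A rough sketch of what split-named, order-independent results could look like on the caller side. The types below are hypothetical and are not the `LeafSearchResponses` protobuf; a `u64` hit count stands in for a real per-split response, and treating a split that is missing from the response as failed is an assumption of this sketch.

```rust
use std::collections::HashMap;

/// Hypothetical per-split outcome returned by one Lambda invocation.
#[derive(Debug, Clone, PartialEq)]
enum SplitOutcome {
    Success { num_hits: u64 },
    Failure { error: String },
}

/// A result named by its split ID, so ordering does not matter.
#[derive(Debug, Clone, PartialEq)]
struct NamedSplitResult {
    split_id: String,
    outcome: SplitOutcome,
}

/// Keeps every successful split and lists the failed ones (sorted by ID).
/// Requested splits absent from the response are reported as failed too.
fn collect_outcomes(
    requested: &[&str],
    results: Vec<NamedSplitResult>,
) -> (HashMap<String, u64>, Vec<String>) {
    let mut successes = HashMap::new();
    let mut failed = Vec::new();
    for result in results {
        match result.outcome {
            SplitOutcome::Success { num_hits } => {
                successes.insert(result.split_id, num_hits);
            }
            SplitOutcome::Failure { .. } => failed.push(result.split_id),
        }
    }
    for split_id in requested {
        let known = successes.contains_key(*split_id)
            || failed.iter().any(|f| f.as_str() == *split_id);
        if !known {
            failed.push(split_id.to_string());
        }
    }
    failed.sort();
    (successes, failed)
}
```

A single failing split then costs only that split: the successful results can still be merged and cached individually.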


trinity-1686a · 2026-02-19T10:03:46Z

docs/configuration/lambda-config.md

+You can deploy the Lambda function manually without `auto_deploy`:
+1. Download the Lambda zip from [GitHub releases](https://github.com/quickwit-oss/quickwit/releases)
+2. Create or update the Lambda function using AWS CLI, Terraform, or the AWS Console
+3. Publish a version with description format `quickwit:{version}-{sha1}` (e.g., `quickwit:0_8_0-fa752891`)


is that still accurate now that description contains some configuration params?

trinity-1686a · 2026-02-19T10:04:52Z

quickwit/quickwit-lambda-client/src/deploy.rs

+/// We also pass the deploy config, as we want the function to be redeployed
+/// if the deployed is changed.


Suggested change

-/// We also pass the deploy config, as we want the function to be redeployed
-/// if the deployed is changed.
+/// We also pass the deploy config, as we want the function to be redeployed
+/// if the config is changed.


fulmicoton requested a review from Copilot February 12, 2026 12:51

Copilot started reviewing on behalf of fulmicoton February 12, 2026 12:52

fulmicoton-dd force-pushed the lambda3 branch 2 times, most recently from 63a7c92 to 0441e5f Compare February 12, 2026 12:58

fulmicoton changed the title ~~This PR introduces the ability to offload leaf search work to AWS Lam…~~ Offload leaf search work to AWS Lambda functions Feb 12, 2026

Copilot AI reviewed Feb 12, 2026


fulmicoton-dd force-pushed the lambda3 branch 2 times, most recently from e212982 to 94b0a4d Compare February 12, 2026 13:48

fulmicoton-dd marked this pull request as ready for review February 12, 2026 13:48

fulmicoton-dd force-pushed the lambda3 branch from 94b0a4d to 68cf3dd Compare February 12, 2026 13:57

fulmicoton requested a review from trinity-1686a February 12, 2026 13:57

fulmicoton-dd force-pushed the lambda3 branch from 68cf3dd to c335f75 Compare February 13, 2026 11:26

trinity-1686a reviewed Feb 13, 2026


trinity-1686a reviewed Feb 16, 2026


fulmicoton-dd commented Feb 16, 2026


fulmicoton-dd force-pushed the lambda3 branch from caf78e6 to 368af56 Compare February 16, 2026 16:55

fulmicoton-dd added 2 commits February 18, 2026 13:29

CR comment, checking sha256 of downloaded file

7808c54

Changes in the lambda offloading

6627ae7


fulmicoton-dd force-pushed the lambda3 branch from 45cd513 to 6627ae7 Compare February 18, 2026 12:31

fulmicoton reviewed Feb 19, 2026


trinity-1686a self-requested a review February 19, 2026 09:22

trinity-1686a approved these changes Feb 19, 2026


CR comments

ad71de4

fulmicoton-dd force-pushed the lambda3 branch from d087be4 to ad71de4 Compare February 19, 2026 10:40

fulmicoton-dd added 2 commits February 19, 2026 12:40

CR comment: using binary heap in greedy_batch_split

c7be288

Adding assert on search permit provider unit test

1127e12

fulmicoton-dd force-pushed the lambda3 branch from 3e0190a to 1127e12 Compare February 19, 2026 12:08

fulmicoton enabled auto-merge (squash) February 19, 2026 12:22

Merge branch 'main' into lambda3

5006df2

fulmicoton merged commit 8412e6d into main Feb 19, 2026
8 checks passed

fulmicoton deleted the lambda3 branch February 19, 2026 12:31
